Neural Embeddings for Populated Geonames Locations
Authors
Abstract
The application of neural embedding algorithms (based on architectures like skip-grams) to large knowledge bases like Wikipedia and the Google News Corpus has tremendously benefited multiple communities in applications as diverse as sentiment analysis, named entity recognition and text classification. In this paper, we present a similar resource for geospatial applications. We systematically construct a weighted network that spans all populated places in Geonames. Using a network embedding algorithm that was recently found to achieve excellent results and is based on the skip-gram model, we embed each populated place into a 100-dimensional vector space, in a similar vein as the GloVe embeddings released for Wikipedia. We demonstrate potential applications of this dataset resource, which we release under a public license.
Resource Type. Datasets generated using novel methods/algorithms.
Github. https://github.com/mayankkejriwal/Geonames-embeddings
Figshare/DOI. https://doi.org/10.6084/m9.figshare.5248120
License. MIT License
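The pipeline outlined in the abstract (a weighted network of places fed to a skip-gram-based embedding algorithm) can be illustrated with a minimal sketch. The abstract does not name the network embedding algorithm, so the weighted random walks below are an assumption: they are one common way to turn a graph into "sentences" for a skip-gram model, not necessarily the authors' method. The place names, edge weights, and the helper names `weighted_walk` and `build_corpus` are all hypothetical.

```python
import random

# Hypothetical toy network: node -> {neighbor: edge weight}.
# Names and weights are invented for illustration and are not
# taken from the released dataset.
GRAPH = {
    "Los Angeles": {"Pasadena": 0.9, "Long Beach": 0.7},
    "Pasadena": {"Los Angeles": 0.9, "Long Beach": 0.2},
    "Long Beach": {"Los Angeles": 0.7, "Pasadena": 0.2},
}


def weighted_walk(graph, start, length, rng):
    """Return one random walk; each step is drawn in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def build_corpus(graph, walks_per_node, length, seed=0):
    """Start walks from every node; each walk plays the role of a sentence."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        for node in graph:
            corpus.append(weighted_walk(graph, node, length, rng))
    return corpus


corpus = build_corpus(GRAPH, walks_per_node=2, length=5)
```

A corpus of this shape could then be handed to any skip-gram implementation configured for 100-dimensional vectors, so that places that co-occur on walks end up close together in the vector space. The walk length and the number of walks per node here are arbitrary illustrative values.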
Similar Resources
Surveying GeoNames Gazetteer Data for the Nordic Countries
This paper takes a look at freely available gazetteer data for the Nordic countries. We examine locations in this region to understand their characteristics and the quality of the available data. Several indicators are developed and discussed to estimate the expected data quality. The distribution and coverage of the data is mapped and the accuracy and quality indicators are visualized. The use...
Is Small More Interesting? Examining Countries' GeoNames Places linked to Wikipedia
Following up on previous analyses [1], we examine the geospatial and thematic data in GeoNames [2]. It is the largest freely available gazetteer – a geographical thesaurus – with a worldwide coverage. One measure of interestingness for a country can be its number of populated places represented as pages in Wikipedia. In GeoNames, on average only 20% of populated places, i.e., cities, towns, vil...
A Semantic Schema for Geonames
As part of a broader strategy towards supporting semantic interoperability in geospatial applications, in this paper we present a semantic schema we designed for GeoNames and the qualitative improvements we obtained by enforcing it on the data. Introduction. GeoNames (www.geonames.org) is a well-known geospatial dataset providing geographical data and metadata of around 7 million unique named p...
Scalable Generation of Type Embeddings Using the ABox
Structured knowledge bases gain their expressive power from both the ABox and TBox. While the ABox is rich in data, the TBox contains the ontological assertions that are often necessary for logical inference. The crucial links between the ABox and the TBox are served by is-a statements (formally a part of the ABox) that connect instances to types, also referred to as classes or concepts. Latent...
Pose-Driven Deep Models for Person Re-Identification
Person re-identification (re-id) is the task of recognizing and matching persons at different locations recorded by cameras with non-overlapping views. One of the main challenges of re-id is the large variance in person poses and camera angles since neither of them can be influenced by the re-id system. In this work, an effective approach to integrate coarse camera view information as well as f...